[57] ABSTRACT

An architecture for a very large scale integrated (VLSI)
implementation of a finite impulse response (FIR) digital
filter having no multipliers and a coefficient space limited
to powers of two. The filter structure includes a
data bus, a coefficient bus and a sum-in bus to each
coefficient tap. Each tap has a coefficient and control
word register which is loaded during an initialization
phase of the filter. Multiplication is provided by a
shifter which provides the correct power of two
weighting of an input data sample. The weighted data
sample at each tap is added to the output of the previous
tap. This architecture results in a regular, modular
structure which can be cascaded and which is programmable
for various data word lengths and coefficient
spaces.

16 Claims, 7 Drawing Sheets
FIG. 4A

[Figure: coefficient and control word, with fields labeled ZERO COEFF. (ZC), SIGN OF COEFF. and VALUE OF THE POWER OF COEFF.]

FIG. 4B

VALUE OF COEFF.   MULTIPLIED VALUE

$2^{0}$           S 7 6 5 4 3 2 1 0 Z Z Z Z Z Z Z
$2^{-1}$          S Z 7 6 5 4 3 2 1 0 Z Z Z Z Z Z
$2^{-2}$          S Z Z 7 6 5 4 3 2 1 0 Z Z Z Z Z
$2^{-3}$          S Z Z Z 7 6 5 4 3 2 1 0 Z Z Z Z
$2^{-4}$          S Z Z Z Z 7 6 5 4 3 2 1 0 Z Z Z
$2^{-5}$          S Z Z Z Z Z 7 6 5 4 3 2 1 0 Z Z
$2^{-6}$          S Z Z Z Z Z Z 7 6 5 4 3 2 1 0 Z
$2^{-7}$          S Z Z Z Z Z Z Z 7 6 5 4 3 2 1 0

FOR POSITIVE COEFFICIENTS
ARCHITECTURE FOR POWER OF TWO COEFFICIENT FIR FILTER

CROSS REFERENCE TO RELATED APPLICATION

This application is related to pending application Ser.
No. 923,534, filed Oct. 27, 1986, entitled A MULTIPLIERLESS
FIR DIGITAL FILTER WITH TWO TO THE NTH POWER COEFFICIENTS,
Amihai Miron and David Koo, inventors, assigned to the
assignee of the present application and which is incorporated
herein by reference as background information.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention pertains to the field of non-recursive
digital filters used for digital signal processing and real
time digital video processing. In particular it pertains to
an architectural realization, in the form of very large
scale integrated (VLSI) circuits, of finite impulse response
(FIR) filters which do not require multipliers
and which have only coefficients of two to the Nth
power.

2. Description of the Prior Art

Among the different types of digital filters there has
been a great deal of interest in finite impulse response
(FIR) digital filters (also called transversal filter). The
reason for this is that powerful and mature optimization
theories exist to aid in the filter design. FIR filters can
easily be designed to approximate a prescribed
magnitude/frequency response to arbitrary accuracy with
an exactly linear phase characteristic. The non-recursive
FIR filters contain only zeroes in the finite z-plane
and hence are always stable. These features make them
very attractive for most digital signal processing
applications.

Finite impulse response (FIR) digital filters are
widely used in digital signal processing, as well as in
real-time digital video processing. The conventional
hardware realization of an FIR digital filter utilizes the
basic functional components of delay units, multipliers
and adders. Among these basic functional components,
multipliers are generally the most complex for hardware
realization, and occupy large "real estate" area,
which increases the cost of the filter. The cost of
multipliers in discrete component systems is high. From the
point of view of VLSI chip design, the area occupied by
a multiplier on an IC filter chip is too large. Cost is not
the only important factor, but the operational speed of a
filter is even more significant in a variety of applications;
for example, in real-time video processing and
other high speed digital signal processing. In the
conventional FIR digital filter, a high percentage of the
propagation delay time is due to multipliers, which
reduce the speed of the filter. Therefore, to improve the
operational speed, reduce the cost and simplify the
structural complexity for VLSI chip design, it is desirable
to eliminate time-consuming multipliers from digital
FIR filters.

Current technical literature includes numerous articles
directed toward the reduction or elimination of
multipliers in the architecture or design of FIR digital
filters, while at the same time proposing solutions
directed to increasing the speed of these filters for use in
real time digital signal processing applications.

In the prior patent art, U.S. Pat. No. 3,979,701 discloses
a non-recursive digital filter composed of a cascaded
plurality of basic sections, each of which is characterized
by coefficient values of integer powers of
two. The filter of this patent uses no multipliers and
claims an operating speed several times faster than other
filters which utilize multipliers.

The multiplierless FIR filter disclosed in this application
has certain concepts which appear to be similar to
those of U.S. Pat. No. 3,979,701 but there are important
differences.

The filter described in U.S. Pat. No. 3,979,701 has
two basic building blocks from which the filter is constructed:
Type 1 and Type 2. The Type 1 unit has only
coefficients with a value of 1 (see line 53 to line 56 of
column [...] of U.S. Pat. No. 3,979,701). The Type 2 unit
has only an even number of delay elements and only
three coefficients, the center coefficient value of which
is always equal to 1 (see line 7 to line 12 of column 4 of
U.S. Pat. No. 3,979,701).

SUMMARY OF THE INVENTION

The invention pertains to the architecture and VLSI
implementation of an FIR digital filter which contains
no multipliers and in which the coefficient space is
limited to only powers of two. In conventional digital
[...] coefficients are linearly quantized
[...] two to the Nth power different levels. In multiplier[...]
coefficients are non-[...]
multipliers may be replaced by shift registers
[...] non-conventional design.

The filter architecture of the present invention
has a structure that is regular and modular. It utilizes
a structure in which three buses go into
each tap: the data bus, the coefficient bus and the sum-in
bus. The data bus brings the broadcast data sample to
each tap. The coefficient bus contains the weighting
factor information. The sum-in bus brings the delayed
output of the previous tap. Going out of each tap is the
sum-out bus, which is the output of each tap and which
goes to the sum-in input of the next tap. This regular,
modular architecture lends itself to cascading of filter
sections for larger filters.

Each tap has a coefficient register which contains the
coefficient and control word information for that tap.
This information is loaded in the initialization phase of
filter operation. Each tap has a shifter which uses
the coefficient for the correct power of two weighting.
The output of the shifter is the weighted data sample
which is latched in a pipelining latch. The output of the
latch is added to the output of the previous tap by an
adder. The adder's output is delayed by one time unit
and then passed on as the adder input of the next tap.

The filter works in two phases, the initialization phase
and the normal operation phase. In the initialization
phase, the coefficients and control words are loaded for
each tap. The coefficient registers are shift registers
connected in a serial chain and the loading thereof is
serial. A non-destructive verification of the loading is
accomplished by reading out serially the coefficients
and control words and reloading them back in a closed
loop simultaneously so that at the end of the verification
procedure all coefficients and control words reside in
the correct registers.

This filter uses only powers of two as coefficients. As
binary multiplication by powers of two is nothing but a
shift of the multiplicand, complex multiplication is
replaced by a simple shifter in this case. With the use of
only negative powers of two as coefficients, the shift
operation is simplified to only right shift. Assuming the
data (multiplicand) is always positive, this shifter can
handle both positive and negative coefficients. For the
example described and illustrated, the output of this
multiplier is one's complement 16 bit data words with
the most significant bit as sign bit and fifteen magnitude
bits, hence enabling the provision of a maximum seven
bit shift for an eight bit multiplicand, which is equivalent
to a multiplication by $2^{-7}$. This is the most negative
power of two which such a multiplier can handle. Thus,
properly programmed it can multiply the multiplicand
with any one of the following values

$0, \pm 2^{0}, \pm 2^{-1}, \pm 2^{-2}, \ldots, \pm 2^{-7}$

However, this limitation can be overcome by an
expansion of the coefficient space. The power-of-two
coefficient space can easily be extended beyond the
current limit of $\pm 2^{-7}$ by modifying the shifter and
certain other elements which handle its output. The
overall architecture of the shifter would not change; it
would only be extended to include the increase of
coefficient space.
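
The shift-based multiplication described above can be illustrated with a short sketch. The function names, the argument layout and the use of Python are assumptions made for illustration only; what comes from the text is the word format (one sign bit, fifteen magnitude bits, one's complement for negative coefficients) and the eight bit positive multiplicand. MAX_SHIFT is kept as a constant so that an extended coefficient space would amount to changing one value.

```python
DATA_BITS = 8                      # width of the positive multiplicand
MAX_SHIFT = 7                      # most negative power of two is 2**-MAX_SHIFT
MAG_BITS = DATA_BITS + MAX_SHIFT   # fifteen magnitude bits
WORD_BITS = MAG_BITS + 1           # plus one sign bit gives sixteen bits
WORD_MASK = (1 << WORD_BITS) - 1


def shift_multiply(data, power, negative=False):
    """Return the one's complement word for data * (+/-) 2**-power."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must be a positive DATA_BITS wide value")
    if not 0 <= power <= MAX_SHIFT:
        raise ValueError("power is outside the coefficient space")
    # Place the multiplicand at the top of the magnitude field, then shift right.
    magnitude = (data << MAX_SHIFT) >> power
    if negative:
        # One's complement: invert every bit, which also sets the sign bit.
        return ~magnitude & WORD_MASK
    return magnitude


def word_value(word):
    """Interpret a one's complement word as a real number (hypothetical helper)."""
    if word >> MAG_BITS:                       # sign bit is set
        return -(~word & WORD_MASK) / (1 << MAX_SHIFT)
    return word / (1 << MAX_SHIFT)
```

Under these assumptions the largest multiplicand FF with coefficient $2^{0}$ gives the word 7F80, and 80 with coefficient $-2^{-1}$ gives DFFF, which `word_value` reads back as -64.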

Similarly, the architecture is not limited to data
words of 8 bits. To handle longer data words of any
arbitrary size, all that is needed is an increase of the size
of the NAND gate sets in the shifter from 8 to whatever
data size is desired. Obviously, the ADDER size
will have to be increased or decreased according to the
maximum value of both the coefficient space and data
word size.
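
How the adder size follows from the coefficient space and the data word size can be sketched as simple arithmetic. The function name and its parameters are hypothetical; the rule it encodes (one sign bit, the magnitude bits of one weighted sample, and enough extra bits to accumulate all stages) is the one the description applies to the twenty bit adder of a sixteen tap filter.

```python
def adder_width(data_bits, max_shift, taps):
    """Bits needed to accumulate `taps` weighted samples without overflow."""
    magnitude_bits = data_bits + max_shift   # width of one weighted sample
    growth_bits = (taps - 1).bit_length()    # ceil(log2(taps)) for taps >= 1
    return 1 + magnitude_bits + growth_bits  # sign + magnitude + accumulation


# Eight bit data, coefficients down to 2**-7, sixteen taps.
width = adder_width(8, 7, 16)

# Worst case: every tap contributes the largest product 7F80 (hex).
worst_case_sum = 16 * 0x7F80
fits = worst_case_sum < (1 << (width - 1))   # compare against the magnitude range
```

With these inputs `width` is 20 and `fits` is true. Extending the coefficient space to $\pm 2^{-15}$ under the same rule would give `adder_width(8, 15, 16) == 28`.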

The proposed FIR filter structure is fully programmable
for a coefficient space C such that:

$$C \in \{0, \pm 2^{0}, \pm 2^{-1}, \pm 2^{-2}, \ldots, \pm 2^{-7}\}$$

To maintain programmability of the filter, all possible
shifts of the data inputs are provided for, using a
multiplexer whose input is the output of sets of NAND gates,
having positive inputs and a selection line. Outputs of
the NAND gates are shifted and hard-wired to the
multiplexer. The one's complement conversion of the
multiplexer output is accomplished by a set of EX-OR
gates. Thus, using a shifter and a decoder, NAND gates
and EX-OR gates, the filter multiplies a coefficient by a
data word without a standard multiplier.
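
A gate-level sketch of such a shifter may help to show why two levels of NAND gates act as a multiplexer. The wiring chosen here is an assumption: bit b of the set selected for $2^{-k}$ is taken to be hard-wired to magnitude position b + 7 - k, and one second-level NAND per output position collects every first-level gate wired to it. Function and variable names are illustrative.

```python
DATA_BITS = 8
MAX_SHIFT = 7
MAG_BITS = DATA_BITS + MAX_SHIFT   # fifteen magnitude bits


def nand(*inputs):
    """N-input NAND on 0/1 values."""
    return 0 if all(inputs) else 1


def gate_level_shifter(data, power, negative=False, zero=False):
    """Model the NAND-NAND multiplexer followed by the EX-OR stage."""
    # One-hot selection lines; no set is selected for a zero coefficient.
    select = [0 if zero else int(k == power) for k in range(MAX_SHIFT + 1)]
    bits = [(data >> b) & 1 for b in range(DATA_BITS)]

    # First level: eight sets of eight two-input NAND gates.
    first = [[nand(bits[b], select[k]) for b in range(DATA_BITS)]
             for k in range(MAX_SHIFT + 1)]

    # Second level: unselected gates output 1, so each output NAND simply
    # inverts the one selected gate wired to its position.
    magnitude = []
    for pos in range(MAG_BITS):
        wired = [first[k][b]
                 for k in range(MAX_SHIFT + 1)
                 for b in range(DATA_BITS)
                 if b + MAX_SHIFT - k == pos]
        magnitude.append(nand(*wired))

    # EX-OR stage: complement every magnitude bit for a negative coefficient.
    sign = 1 if negative else 0
    out_bits = [m ^ sign for m in magnitude] + [sign]
    return sum(bit << i for i, bit in enumerate(out_bits))
```

In this model every positive result equals `(data << 7) >> power`, and a zero coefficient yields the all-zero word because no selection line is active.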

The filter has been successfully simulated using various
coefficients and random data.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of the classic realization of 
an FIR filter; 

FIG. 2 is a circuit diagram of the filter of the present
invention;

FIG. 3 is a circuit diagram of each tap of the filter of
FIG. 2;

FIG. 4a is a diagram of the coefficient word for the
filter of FIG. 2;

FIG. 4b shows the shifts of the multiplicand for positive
coefficients of the filter of FIG. 2;

FIG. 5 is a logic diagram of the multiplier/shifter of 
the filter of FIG. 2; 

FIG. 6 is a circuit diagram of the adder of the filter of 
FIG. 2; 

FIG. 7 is a diagram showing the sequence of coeffici- 
ent loading and verification for the filter of FIG. 2. 



DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

Filtering is one of the most important functions of 
real time linear signal processing. Among the different 
types of digital filters there has been a great deal of 
interest in finite impulse response (FIR) digital filters 
(also called transversal filter). The reason for this is that 
powerful and mature optimization theories exist to aid 
in the filter design. FIR filters can easily be designed to 
approximate a prescribed magnitude/frequency re- 
sponse to arbitrary accuracy with an exactly linear 
phase characteristic. The non-recursive FIR filters con- 
tain only zeroes in the finite z-plane and hence are al- 
ways stable. These features make them very attractive 
for most digital signal processing applications. 

The FIR filter is characterized by the input/output
relation

$$Y_n = \sum_{i=0}^{N-1} C_i X_{n-i} \qquad (1)$$

where $C_i$ are the coefficients (weighting factors),
$X_{n-i} = X(t_n - iT_s)$ is the sampled input signal and
$Y_n = Y(t_n)$ is the corresponding output signal. $T_s$ is the
sampling period, $t_n = nT_s$ are sample instances and
$f_s = 1/T_s$ is the sample rate. So each output sample is the
weighted sum of a finite number of input samples (N in
Eq. 1).
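
As a minimal sketch of Eq. 1, the weighted sum can be written out directly. The function name is hypothetical, and samples before the first one are assumed to be zero, which the text does not specify.

```python
def fir_direct(coefficients, samples):
    """Evaluate Y_n = sum over i of C_i * X_(n-i) for every sample index n."""
    n_taps = len(coefficients)
    output = []
    for n in range(len(samples)):
        acc = 0
        for i in range(n_taps):
            if n - i >= 0:                 # inputs before the first sample are zero
                acc += coefficients[i] * samples[n - i]
        output.append(acc)
    return output
```

For example, `fir_direct([3, 2, 1], [1, 0, 0, 0])` returns the coefficients themselves followed by a zero, which is the finite impulse response that gives the filter its name.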

The classical realization of Eq. 1 is illustrated in FIG. 
1, showing the well known semi-systolic parallel-in 
serial-out transversal filter architecture 20, in which the 
weighting of the input samples is accomplished by mul- 
tipliers. The data is broadcast via a data bus 22 globally 
to every tap in the filter, where it is multiplied in multi- 
plier 24 by a weighting factor (the coefficient) from a 
coefficient register (not shown) appearing on input 23 
and then added in adder 26 to a delayed output from 
delay 28 of the previous tap. Thus, the basic building 
blocks of transversal filter 20 are the multiplier 24, the 
adder 26 and the delay 28. The multiplier is the most
time consuming and expensive building block of the
filter; thus there has been a great effort to make the
multiplication operation cheaper and faster, to increase the
overall speed of operation of the filter. It is this
weighting factor multiplier that is eliminated in the
filter architecture of the invention.
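
The broadcast structure just described can be sketched as a clocked loop. This is an illustrative model, not the circuit of FIG. 1: the function name is hypothetical, the output is taken at the last adder before its delay, and the coefficient order along the chain (the last tap applying $C_0$) is an assumption chosen so that the result agrees with Eq. 1.

```python
def fir_broadcast(coefficients, samples):
    """Broadcast-input FIR: each tap multiplies, adds, then delays its sum."""
    n_taps = len(coefficients)
    # Tap j holds C[N-1-j], so the tap nearest the output applies C[0].
    chain = list(reversed(coefficients))
    delays = [0] * n_taps                  # one delay register per tap output
    output = []
    for x in samples:                      # the same sample reaches every tap
        sums = []
        for j in range(n_taps):
            sum_in = delays[j - 1] if j > 0 else 0
            sums.append(sum_in + chain[j] * x)   # multiplier, then adder
        output.append(sums[-1])
        delays = sums                      # clock edge: delays latch the sums
    return output
```

Each pass through the inner loop costs one multiplication per tap, which is the operation the invention replaces with a shift.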

If the coefficient space is limited to only powers of
two, then the complex multiplication can be altogether
replaced by a simple shift operation. This is the main
feature of the FIR filter structure of the present invention.
It is obvious that this restriction on the coefficient
space will affect the performance of the filter, and
substantial research has been conducted to compensate for
this limitation. The most promising approach was outlined
in the cross-referenced application by Koo and
Miron, which is the primary algorithm to be used by
this structure. However, since this invention is a fully
programmable filter, any power of two filter algorithm
can be implemented.
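
A filter restricted to power-of-two coefficients can be evaluated with shifts and additions only. The sketch below assumes a hypothetical tap encoding (a sign and a power per tap) and scales every result by $2^{7}$ so that the arithmetic stays in integers; it uses ordinary signed integers rather than the one's complement words of the hardware.

```python
MAX_SHIFT = 7   # coefficient space reaches down to 2**-7


def fir_power_of_two(taps, samples):
    """taps: list of (sign, power), meaning sign * 2**-power with sign in {-1, 0, 1}.

    Returns the outputs multiplied by 2**MAX_SHIFT.
    """
    output = []
    for n in range(len(samples)):
        acc = 0
        for i, (sign, power) in enumerate(taps):
            if sign == 0 or n - i < 0:
                continue                    # zero coefficient or no sample yet
            weighted = samples[n - i] << (MAX_SHIFT - power)   # shift, no multiply
            acc += weighted if sign > 0 else -weighted
        output.append(acc)
    return output
```

With taps for 1, 1/2 and -1/4 and the samples 8, 4, 0 this gives 1024, 1024, 0, that is 8, 8, 0 after removing the scale factor of 128.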

FILTER ARCHITECTURE 

With reference to FIG. 2, we will describe the archi- 
tecture of filter 10 of the invention hierarchically. For 
the example illustrated, the top level of the hierarchy 
shows a ten tap structure of the filter 10. Three busses 
go into each tap 30, the data bus 12, the coefficient bus 
14 and the sum-in bus 16. Going out of each tap is the
sum-out bus 18. The purpose and function of each bus
will become clear in the discussion of the filter operation,
but it is clear that the structure is extremely regular
and modular. This is an extremely important feature in
the architecture from the VLSI implementation point of
view.

Referring to FIG. 2, the data bus 12 is the bus which
brings the broadcast data sample D-IN to each tap 30.
The coefficient bus 14 is a loop including the C-IN bus
and the C-OUT bus 15, and originates at C-I/O gates
41, 44. It brings the weighting factor information and
the sum-in bus 16 brings the delayed output of the previous
tap 30. The sum-out bus 18 is the output of each tap
30 and can be fed into the sum-in input 16 of the next tap
30. Note that this architecture lends itself to cascading
very naturally and easily, and is by no means limited to
[...]

FIG. 3 shows the building blocks of each tap 30.
They will each be explained in detail later; for now we
will just identify them. The C-REG 32 is the register
containing the coefficient and control word information
CCW. This information is loaded in the initialization
phase of the filter operation. Most of this information is
used by the SHIFTER 34 for the correct power of two
weighting. The output of the SHIFTER 34 is the
weighted data sample, which is latched by a pipelining
latch 36, used for speed enhancement purposes. The
output of the latch 36 is added to the output of the
previous tap by the ADDER 38 and the ADDER's
output is delayed in DELAY 40 by one time unit
of clock CLK-N before being passed on as the ADDER
input of the next tap.

In the following section we will explain the
coefficient/control word loading and non-destructive
verification procedure, during the initialization phase of the
filter operation.

COEFFICIENT LOADING AND VERIFICATION

Filter 10 works in two phases: the initialization phase
and the normal operation phase. The coefficients and
control words (CCW) are loaded during the first phase.
The C-REG 32 registers are connected to each other in
a serial chain, hence the loading is serial. Consider the
particular case of a ten tap filter. CCW #9 is applied to
the coefficient input bus (C-IN) 14 of FIG. 2 and FIG.
3 and all the C-REG's are clocked by the clock CLK-C
(discussed later). It is obvious that CCW #9 will be
loaded in C-REG #0. Now CCW #8 is applied to the
CI/O bus 14 and all the C-REG's are clocked again.
This time CCW #9 shifts from C-REG #0 to C-REG
#1 and CCW #8 is loaded in C-REG #0. This procedure
is repeated ten times. Each time a new CCW is applied
to the CI/O bus 14 and all the C-REG's 32 are clocked,
that particular CCW is loaded in C-REG #0 and all the
other CCW's are shifted to the next C-REG 32 in the
serial chain. At the end of the cycle (ten clock pulses in
this case), all the CCW's are loaded in the correct order
in the C-REG's 32. Summarizing, the CCW furthest
away from the CI/O bus 14 is loaded first; the next
furthest away is next, and so on till all the CCW's have
been loaded. Each new CCW pushes all the old CCW's
one register over, and at the end of the cycle the loading
is complete.

The next step is verification of the above loading. The
purpose is to make sure all the CCW's have been loaded
in their required destination registers. We utilize a
non-destructive verification procedure by which the CCW's
are read out of C-REG's 32 for verification and reloaded
in a closed loop back to the C-REG's 32 simultaneously,
so at the end of the verification cycle the
CCW's reside back at their respective C-REG's. To
keep circuit simplicity and the I/O pins of the chip
within limits, we choose not to use any CCW address
schemes. Our verification procedure is also a serial
procedure. In FIG. 2 the CI/O bus is a bidirectional
bus, comprising C-IN bus 14 and C-OUT bus 15, the
direction of which is controlled by an external signal on
input 42 labeled C-RW. When the C-RW signal is set to
1, bus 14 acts as an input bus. Bus 15 acts as an output
bus when C-RW is set to 0. It is obvious that for the
CCW loading procedure, C-RW is set to 1 and the
CCW's are loaded as discussed above. For verification,
the C-RW signal is set to 0 and the CLK-C is clocked.
At the first clock, the last coefficient (Coefficient #9)
appears at the output (C-OUT) 44 (FIG. 3) and at the
same time is loaded back in C-REG #0. The next clock
cycle brings the next to last coefficient at the C-OUT
bus and simultaneously loads it into C-REG #0 while
pushing the current resident of C-REG #0 (coefficient
#9) to C-REG #1. At the end of the verification cycle
(ten clock pulses) all the coefficients have been read out
from the C-OUT bus 15 through gates 41 and at the
same time reloaded through gates 44 into the C-REG's
32 via C-IN bus 14.

POWER OF 2 MULTIPLIER AND ADDER

This filter uses only powers of two as coefficients. As
binary multiplication by powers of two is nothing but a
shift of the multiplicand, complex multiplication is
replaced by a simple shifter 34 in this case. With the use of
only negative powers of two as coefficients, the shift
operation is simplified to only right shift. Assuming the
data (multiplicand) has eight bits and is always positive,
this multiplier can handle both positive and negative
coefficients. Output of this multiplier is one's complement
16 bit data with the most significant bit (MSB) as
sign bit and fifteen magnitude bits, hence keeping the
provision of maximum seven bit shift for the eight bit
multiplicand, which is equivalent to a multiplication by
$2^{-7}$. This is the most negative power of two which this
multiplier can handle and which the filter requires as
well. Otherwise, properly programmed, it can multiply
the multiplicand with any one of the following values

$0, \pm 2^{0}, \pm 2^{-1}, \pm 2^{-2}, \ldots, \pm 2^{-7}$

FIG. 4a shows the format of a control word (CCW)
and FIG. 4b shows how the shift operation takes place
for all positive values of coefficients.

FIG. 5 shows the logic diagram which implements
the above mentioned one's complement multiplier/shifter
34 operation. To maintain the programmability,
in every stage of this multiplier we have incorporated
all possible above shown shifts and that is done by a
simple multiplexer whose input section 46 consists of
eight sets of eight NAND gates. Each of the sets is fed
by input lines 47 with the eight magnitude bits of the
positive (thus not requiring any sign bit) multiplicand as
shown in the left side of FIG. 5. Each set of NAND
gates has a separate selection line 48. Which one of
these sets of NAND gates 46 will be selected depends
upon the value of the coefficient programmed for that
stage in a C-REG 32. Outputs 49 of the selected set of
NAND gates 46 are properly shifted and hardwired to
the output stage 50 of the multiplier/shifter 34 to
produce a fifteen bit magnitude AA bus 52 as shown in
FIG. 5. [...] multiplier
shifter circuit 34, the one's complement conversion
logic circuit 54, which is a set of EX-OR gates,
produces the bit by bit complement of AA bus 52, in
case of negative coefficients. As the multiplicand is
always positive, the sign of the multiplied output on bus
56 depends on that of the coefficient. This sixteen bit
(fifteen bit magnitude together with sign bit) bus 56 is
latched in latch 36 (FIG. 3) and delivered to the adder
38 (FIGS. 3 and 6).

The section following the multiplier/shifter 34 and
latch 36 is an adder 38 (FIG. 6). It is a partially (4 bit)
full carry look ahead adder, cascaded to form a twenty
bit adder. It receives the sixteen bit output of
multiplier/shifter 34 via pipelining latch 36 as one of its
inputs 58, and the other input 60 is the twenty bit latched
output of delay 40 of the previous stage. The sign-extension
technique is used for the smaller number in the
adder. The sign of the smaller number, which is in its
one's complement form, is also fed to the carry input of
the adder, thereby converting the smaller number to
its two's complement form. The following explanation
determines the selection of the size (20 bits) of the bigger
number. At a particular stage or tap this number is
the accumulated result of multiplier outputs of all the
stages up to that stage. So the larger the number of stages,
the larger the accumulated result (i.e., size) becomes. In
the design of the chip, besides programmability, we
incorporated its cascadability as well. Experience shows
that a maximum of sixteen taps for a filter with only
power of two coefficients is a good choice to cover
most of the video applications. Considering this fact, the
size (20 bits) is such that it will produce no overflow in
the accumulated result of all sixteen stages after cascading,
even if each of the multipliers produces the
largest possible output 7F80 (that takes place for largest
possible eight bit multiplicand FF and the largest possible
coefficient 1). To achieve this we need one sign bit,
fifteen magnitude bits for each multiplied output and
four bits for sixteen ($2^{4}$) stage accumulation of them,
which is 20 bits in total.

PROGRAMMING THE FILTER

As mentioned earlier, the FIR filter structure 10 is
fully programmable for a coefficient space C such that:

$$C \in \{0, \pm 2^{0}, \pm 2^{-1}, \pm 2^{-2}, \ldots, \pm 2^{-7}\}$$

The CCW contains the information of the actual shift,
as well as the sign of the coefficient. It also contains
information for a coefficient of magnitude 0. The Coefficient
and Control Word CCW is a 5 bit word: the 3 least
significant bits define the power of the coefficient,
the next bit controls the sign and the most significant
bit is for the zero coefficient. This is shown in FIG. 4a.
The three coefficient power control bits pass through a
decoder 62 (FIG. 3) and select one shift from $2^{0}$, ..., $2^{-7}$;
the sign control bit determines the sign of the coefficient.
In case of zero coefficient (which is a special case
since it is not a power of 2) bit number [...] is set to 1.

The entire programming operation is elaborated in
the example of Table 1. In Table 1 the desired coefficient
and its corresponding CCW code are shown. It is
obvious that the use of the decoder 62 enables us to reduce
the number of I/O pins required for the CCW.

POSSIBLE EXTENSION OF THE COEFFICIENT SPACE

The power-of-two coefficient space can easily be
extended beyond the current limit of $\pm 2^{-7}$ by modifying
the shifter 34. For example, to increase the space to
$\pm 2^{-15}$ requires one more bit for the coefficient value
(from 3 to 4) and a 4-to-16 decoder. It also requires
sixteen sets of 8 NAND gates (instead of the currently
used 8 sets). Thus, the overall architecture of the shifter
does not change; it is only extended to include the
increase of coefficient space.

Similarly, filter 10 is not limited to a data word of
eight bits. To increase this to any arbitrary size requires
only a change in the number of NAND gates per set
from eight to whatever data size is desired. Obviously,
the ADDER 38 size (number of bits) will have to
be increased or decreased according to the maximum
coefficient space and data word size.

VLSI IMPLEMENTATION AND SIMULATION

[...] was implemented using Signetics Corporation's
[...] micron double metal standard cell technology.
The current chip has ten taps. The IC layout was
[...] Silvar-Lisco's CAL-MP software. We
[...] simulation results of the various
[...]

INITIALIZATION PHASE

[...] the coefficient loading part of FIG. 7
and Table 2, we desire our filter coefficients to be as
[...] presented them to the C-IN bus 14
[...] fashion and at the end of ten clock pulses we
[...] they are in the correct C-REG's 32. As
mentioned earlier, for loading coefficients the C-RW 42
signal is set to 1.

Now for the verification of our loading the C-RW 42
signal is set to 0 and we again supply ten coefficient clock
pulses. The coefficient verification and reloading portion
of FIG. 7 and Table 2 show that all of the coefficients
appeared at the C-IN bus 14 (in reverse order) and
are reloaded into the C-REG's 32. This concludes the
simulation of our initialization procedure.

NORMAL OPERATION

For normal operation the coefficient clock enable
signal (CEN) is set to 1. Since, once initialization of the
[...] successfully accomplished, we do not
[...] coefficients to change, this signal is an added
protection against a stray coefficient clock pulse (CLK-C)
altering the coefficients. Data is presented via the
D-IN bus 12 and the normal operation clock (CLK-N)
now becomes the only system clock and filtering is
performed on the data.

We have successfully simulated the filter using various
coefficients and random data.

TABLE 1

COEFFICIENT     BINARY    DECIMAL    DESIRED
CONTROL WORD    VALUE     VALUE      COEFFICIENT

C0              10010     18         $2^{-2}$
C1              00000     0          0
C2              11100     28         $-2^{-4}$
C3              10111     23         $2^{-7}$
C4              10000     16         $2^{0}$
C5              00000     0          0
C6              11000     24         $-2^{0}$
C7              10001     17         $2^{-1}$
C8              10110     22         $2^{-6}$
C9              11011     27         $-2^{-3}$
TABLE 2

TIME      C0   C1   C2   C3   C4   C5   C6   C7   C8   C9

COEFFICIENT LOADING PROCEDURE

   0.0     X    X    X    X    X    X    X    X    X    X
 100.0     X    X    X    X    X    X    X    X    X    X
 200.0    27    X    X    X    X    X    X    X    X    X
 300.0    22   27    X    X    X    X    X    X    X    X
 400.0    17   22   27    X    X    X    X    X    X    X
 500.0    24   17   22   27    X    X    X    X    X    X
 600.0     0   24   17   22   27    X    X    X    X    X
 700.0    16    0   24   17   22   27    X    X    X    X
 800.0    23   16    0   24   17   22   27    X    X    X
 900.0    28   23   16    0   24   17   22   27    X    X
1000.0     0   28   23   16    0   24   17   22   27    X
1100.0    18    0   28   23   16    0   24   17   22   27

COEFFICIENT VERIFICATION AND RELOADING PROCEDURE

3300.0    18    0   28   23   16    0   24   17   22   27
3350.0    27   18    0   28   23   16    0   24   17   22
3450.0    22   27   18    0   28   23   16    0   24   17
3550.0    17   22   27   18    0   28   23   16    0   24
3650.0    24   17   22   27   18    0   28   23   16    0
3750.0     0   24   17   22   27   18    0   28   23   16
3850.0    16    0   24   17   22   27   18    0   28   23
3950.0    23   16    0   24   17   22   27   18    0   28
4050.0    28   23   16    0   24   17   22   27   18    0
4150.0     0   28   23   16    0   24   17   22   27   18
4250.0    18    0   28   23   16    0   24   17   22   27
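
The serial loading and the non-destructive verification traced in Table 2 can be reproduced with a small model of the register chain. The function names and the use of a Python list for the chain of C-REG's are assumptions; the behavior modeled is the one described in the text: each clock shifts every CCW one register over, and during verification the word leaving the last register is fed back into the first.

```python
def load(registers, ccws):
    """Clock each CCW into register 0, pushing the others one place along."""
    history = []
    for ccw in ccws:
        registers = [ccw] + registers[:-1]
        history.append(list(registers))
    return registers, history


def verify(registers):
    """Read every CCW out of the last register while reloading it at the front."""
    read_out = []
    for _ in range(len(registers)):
        last = registers[-1]
        read_out.append(last)                 # appears on the output bus
        registers = [last] + registers[:-1]   # closed loop back to register 0
    return registers, read_out


desired = [18, 0, 28, 23, 16, 0, 24, 17, 22, 27]         # C0 to C9 of Table 2
loaded, trace = load([None] * 10, list(reversed(desired)))  # C9's word goes first
restored, seen = verify(loaded)
```

After the ten loading clocks `loaded` equals `desired`. After ten verification clocks `restored` is unchanged and `seen` lists the words in reverse order, starting with 27.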
We claim:

1. A non-recursive finite impulse response (FIR) digital
filter of n taps, where 1 ≤ n, comprising:

means to receive and store a plurality of coefficients,
one coefficient per tap, wherein each coefficient is
an integral power of 2;

means to receive and shift a data input word in each
of said taps, the extent of said shift being determined
by the value of the coefficient stored in said
tap;

said shifted data word being equivalent to the product
of said input data word and said coefficient of
said tap;

means to add said product at each tap to the delayed
output of the previous tap to thereby form at each
tap the cumulative sum of the products at all previous
taps;

means to delay and forward said cumulative sum to
the next tap, and

means to output the cumulative sum of all taps in said
filter.

2. The filter of claim 1 wherein said means to receive 
and store a plurality of coefficients comprises: 

input means to load said coefficients into said filter in
an initialization phase;

a coefficient bus to transport each of said coefficients
to a respective destination tap for each coefficient;
and

a coefficient register at each tap to store the coefficient
for said tap.

3. The filter of claim 2 wherein:

said coefficient bus is a serial bus;

said coefficient registers are connected to each other
in a serial chain; and

said coefficients are loaded into said registers such
that the coefficient for the last register in the chain
is loaded first and each coefficient is shifted down
said chain in inverse relationship to the proximity
of such coefficient to the start of said chain.

4. The filter of claim 3 wherein said coefficient bus is 
a bidirectional bus, and further comprising: 

means to reverse the direction of serial bit flow on 

said coefficient bus; 
output terminal means on said coefficient bus; and 
means to verify the loading of said coefficients by 
reading said coefficients out serially to said output 
terminal means when said bus direction is reversed 
and reloading said coefficients in said destination 
taps. 

5. The filter of claim 4 wherein said coefficient regis- 
ter stores a coefficient control word having five bits, 
three for the coefficient, one for its sign and one bit for 
a zero coefficient. 

6. The filter of claim 4 further including a decoder to 
decode the output of said coefficient register. 

7. The filter of claim 1 wherein said means to receive 
and shift said data input word comprises: 

a data bus connected to each of said taps, whereby 
said data input words are broadcast to each of said 
taps; 

a shifter in each tap in the form of a NAND-NAND 
multiplexer connected to said data bus and to said 
coefficient register of said tap, said shifter receiving 
each data word sequentially and shifting the bits 
thereof according to the coefficient stored in said 
coefficient register, the output of said shifter being 
weighted data sample. 

8. The filter of claim 7 wherein said coefficients are 
only negative powers of two and said shifter shifts the 
bits of each data input word to the right only. 

9. The filter of claim 1 where the coefficient space of
said filter is limited to $2^{-7}$.

10. The filter of claim 9 further including: 

means to expand the filter length by cascading stages 
of said filter. 

11. The filter of claim 1 wherein said filter is fully 
programmable. 

12. The filter of claim 11 wherein said fully program- 
mable filter includes means to provide up to a given 
number of shifts of the bits of said data input word. 

13. The filter of claim 1 wherein said filter is limited
to sixteen taps without any overflow.

14. The filter of claim 1 wherein said means to add 
said products comprises: 

a sum-in bus which transports the cumulative sum of 
previous weighted data samples to each tap; 

an adder to add the output of said shifter of each tap 
to the cumulative sum; 

a delay element; 

a sum-out bus which transports the sum of said adder 
to said delay element, said delay element connected 
to the sum-in bus of the next tap. 

15. The filter of claim 14 wherein for an eight bit data
word and a coefficient of $\pm 2^{-7}$ the output of said
shifter is one sign bit and 15 magnitude bits in one's
complement.

16. The filter of claim 14 wherein said shifter for an
eight bit data word comprises:

a shifter having an input section has eight sets of first
NAND gates, each set receiving all eight bits of
said data word;

a selection line from said coefficient register to each
of said sets of NAND gates, the coefficient register
sending a selection signal over such selection lines
which selects one of said sets of first NAND gates;

a second set of NAND gates to receive the output of
the selected first set of NAND gates, the combination
of the two sets of NAND gates providing the
required shift as a shifter; and

a set of EX-OR gates at the output of said shifter to
perform one's complement conversion on the output
of said shifter, which is latched for transfer to
said adder.

* * * * *